Creation and Evaluation of Extensible Language Resources for Maltese
Abstract
The creation of Language Resources is a labour-intensive process whose difficulty is further compounded when minority languages are concerned (Cunningham, 1999). This paper discusses the creation of an extensible set of Language Resources for Maltese developed by the Maltilex Project at the University of Malta (Rosner et al., 1999), together with quality evaluation mechanisms for minority
Similar resources
Exploring XML-based technologies and procedures for quality evaluation from a real-life case perspective
The use of Extensible Markup Language (XML) for the annotation of Spoken Language Resources (SLR) is becoming increasingly common these days. Therefore the Speech Processing EXpertise centre (SPEX), which is the SLR validation centre of the European Language Resources Association (ELRA), is also being confronted more with XML. The project “Lexica and Corpora for Speech-to-Speech Translation Com...
Crowd-sourcing evaluation of automatically acquired, morphologically related word groupings
The automatic discovery and clustering of morphologically related words is an important problem with several practical applications. This paper describes the evaluation of word clusters carried out through crowd-sourcing techniques for the Maltese language. The hybrid (Semitic-Romance) nature of Maltese morphology, together with the fact that no large-scale lexical resources are available for M...
The development of language resources for Maltese
This paper describes two aspects of the work going on to computerise resources for the Maltese language. The first part describes work on labelling and annotation of spoken Maltese to generate a database suitable for use in deriving speech and speaker recognition tools. It also describes an interactive development system SSUNN that is being used for this work. The second part describes approach...
Incorporating an Error Corpus into a Spellchecker for Maltese
This paper discusses the ongoing development of a new Maltese spell checker, highlighting the methodologies which would best suit such a language. We thus discuss several previous attempts, highlighting what we believe to be their weakest point: a lack of attention to context. Two developments are of particular interest, both of which concern the availability of language resources relevant to s...
The Eclipse Annotator: an extensible system for multimodal corpus creation
The Eclipse-Annotator is an extensible tool for the creation of multimodal language resources. It is based on the TASX-Annotator, which has been refactored in order to fit into the plugin based architecture of the new application.